A Tensor-Variate Gaussian Process for Classification of Multidimensional Structured Data
Authors
Abstract
As tensors provide a natural and efficient representation of multidimensional structured data, in this paper we consider probabilistic multinomial probit classification for tensor-variate inputs with Gaussian process (GP) priors placed over the latent function. To take the underlying multi-mode structural information into account within the model, we propose a framework of probabilistic product kernels for tensorial data based on a generative model assumption. More specifically, the framework can be interpreted as mapping tensors to a space of probability density functions and measuring their similarity by an information divergence. Since tensor kernels enable us to model the input tensor observations, the proposed tensor-variate GP can be regarded as both a generative and a discriminative model. Furthermore, a fully variational Bayesian treatment of multiclass GP classification with the multinomial probit likelihood is employed to estimate the hyperparameters and infer the predictive distributions. Simulation results on both synthetic data and a real-world application, human action recognition in videos, demonstrate the effectiveness and advantages of the proposed approach for classification of multiway tensor data, especially when the underlying structural information shared among the modes is discriminative for the classification task.
Introduction
Tensors (also called multiway arrays) are generalizations of vectors and matrices to higher dimensions and are equipped with a corresponding multilinear algebra. The development of theory and algorithms for tensor decompositions (factorizations) has been an active area of study within the past decade, see e.g. (Cichocki et al. 2009; Kolda and Bader 2009), and these methods have been successfully applied to problems in unsupervised learning and exploratory data analysis. Multiway analysis enables us to effectively capture the multilinear structure of the data, which is usually available as a priori information on the nature of the data.
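As a rough illustration of the probabilistic product kernel idea described in the abstract, the sketch below summarizes each mode-n unfolding of a tensor by a zero-mean Gaussian and measures similarity by an exponentiated symmetrized KL divergence, multiplied across modes. This is a minimal sketch under our own assumptions, not the paper's exact formulation; the function names and the `beta` parameter are illustrative.

```python
import numpy as np

def mode_unfold(T, mode):
    # Mode-n matricization: move `mode` to the front and flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_gaussian(T, mode, eps=1e-6):
    # Fit a zero-mean Gaussian to the columns of the mode-n unfolding;
    # its covariance summarizes the structure along that mode.
    X = mode_unfold(T, mode)
    return X @ X.T / X.shape[1] + eps * np.eye(X.shape[0])

def kl_gauss(S1, S2):
    # KL divergence between zero-mean Gaussians N(0, S1) || N(0, S2).
    d = S1.shape[0]
    S2inv = np.linalg.inv(S2)
    return 0.5 * (np.trace(S2inv @ S1) - d
                  + np.log(np.linalg.det(S2) / np.linalg.det(S1)))

def product_kernel(T1, T2, beta=0.1):
    # Product over modes of exponentiated symmetrized KL divergences:
    # k(T1, T2) = prod_d exp(-beta * (KL(p1||p2) + KL(p2||p1)) / 2).
    k = 1.0
    for mode in range(T1.ndim):
        S1, S2 = mode_gaussian(T1, mode), mode_gaussian(T2, mode)
        sym = 0.5 * (kl_gauss(S1, S2) + kl_gauss(S2, S1))
        k *= np.exp(-beta * sym)
    return k
```

By construction the kernel is symmetric, equals 1 for identical inputs, and decays toward 0 as the per-mode distributions diverge.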
There is a growing need for the development and application of machine learning methods to analyze multidimensional data, such as functional magnetic resonance imaging (fMRI), electrocorticography (ECoG), and electroencephalography (EEG) data, as well as 3D video sequences, thus emphasizing the need to take the information on the structure of the original data into account. Tensors provide a natural and efficient way to describe such multidimensional structured data, and the corresponding learning methods can explicitly exploit this a priori structural information and capture the underlying multi-mode relations to achieve useful decompositions of the data with good generalization ability. Recent research has addressed extensions of the kernel concept to tensor decompositions (Signoretto, De Lathauwer, and Suykens 2011; Xu, Yan, and Qi 2012), aiming to bring together the desirable properties of kernel methods and tensor decompositions for significant performance gains when the data are structured and nonlinear dependencies among latent variables exist. In (Xu, Yan, and Qi 2012), the nonlinear tensor decomposition problem is addressed by a Kronecker product of kernels obtained from different groups of vector inputs. In (Signoretto, De Lathauwer, and Suykens 2011), a chordal-distance-based kernel for tensorial data is introduced with rotation and reflection invariance on the Grassmann manifold. The Gaussian process (GP) (Rasmussen and Williams 2006; Kersting and Xu 2009) is attractive for non-parametric probabilistic inference because knowledge can be specified directly in the prior distribution of the latent function through the mean and covariance functions. Inference can be carried out in closed form for regression under a Gaussian likelihood, but approximation is necessary under non-Gaussian likelihoods.
Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
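To make the idea of a Kronecker product of per-mode kernels concrete, here is a minimal sketch (illustrative only, not the exact construction of Xu, Yan, and Qi 2012): squared-exponential kernels on each mode of a small 2-way grid are combined into one covariance, and the Kronecker structure lets eigendecompositions be performed factor by factor instead of on the full matrix.

```python
import numpy as np

def rbf(X, lengthscale=0.5):
    # Squared-exponential kernel on the rows of X.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

# Hypothetical per-mode inputs: row and column coordinates of a grid.
rows = np.linspace(0, 1, 4)[:, None]
cols = np.linspace(0, 1, 3)[:, None]
K1, K2 = rbf(rows), rbf(cols)

# Kronecker-structured covariance over all 4 * 3 grid points.
K = np.kron(K1, K2)

# The structure pays off in linear algebra: eigendecompose each small
# factor (cost O(n1^3 + n2^3)) instead of the 12x12 matrix (O((n1*n2)^3));
# the eigenvalues of K are all products of the factors' eigenvalues.
w1, _ = np.linalg.eigh(K1)
w2, _ = np.linalg.eigh(K2)
eigs = np.kron(w1, w2)
```

The same factorization trick underlies fast GP inference on multidimensional grids.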
The Gaussian process can be extended to binary classification problems by employing logistic or probit likelihoods (Nickisch and Rasmussen 2008), while multinomial logistic or multinomial probit likelihoods are employed in multiclass Gaussian process classification (Williams and Barber 1998; Chai 2012; Girolami and Rogers 2006). Since exact inference is analytically intractable for logistic and probit likelihoods, approximate inference is widely applied, such as the Laplace approximation (Williams and Barber 1998), expectation propagation (Kim and Ghahramani 2006; Riihimäki, Jylänki, and Vehtari 2012), and variational approximation (Girolami and Rogers 2006). In this paper, we extend multiclass GP classification to a tensor-variate input space in order to incorporate the multiway structure of the inputs into model learning and prediction, which is important and promising for multidimensional structured data.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence
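The multinomial probit likelihood admits an auxiliary-variable expectation form, p(y = c | f) = E_{u ~ N(0,1)}[ prod_{j != c} Phi(u + f_c - f_j) ], which can be estimated by simple Monte Carlo. The sketch below is a minimal illustration of that identity, not the variational scheme of the paper; the function names and sample count are our own assumptions.

```python
import numpy as np
from math import erf, sqrt

# Standard normal CDF, vectorized over numpy arrays.
_phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0))))

def multinomial_probit(f, c, n_samples=20000, seed=0):
    # Monte Carlo estimate of the multinomial probit likelihood
    #   p(y = c | f) = E_{u ~ N(0,1)} [ prod_{j != c} Phi(u + f_c - f_j) ],
    # where f holds one latent function value per class.
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n_samples)
    prod = np.ones(n_samples)
    for j in range(len(f)):
        if j != c:
            prod *= _phi(u + f[c] - f[j])
    return float(prod.mean())
```

With all latent values equal, each of three classes gets probability near 1/3, and the class with the largest latent value always gets the largest probability.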
Similar Articles
InfTucker: t-Process based Infinite Tensor Decomposition
Tensor decomposition is a powerful tool for multiway data analysis. Many popular tensor decomposition approaches—such as the Tucker decomposition and CANDECOMP/PARAFAC (CP)—conduct multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g. missing data and binary data), and (iii) noisy observations and outliers. To ad...
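As a toy illustration of the multilinear factorization that CP performs, the following rank-1 alternating-least-squares sketch factorizes a 3-way tensor into an outer product of three vectors. It is illustrative only; practical CP solvers handle higher ranks, normalization, and convergence checks.

```python
import numpy as np

def cp_rank1(T, n_iter=30):
    # Rank-1 CP approximation T ~= outer(a, b, c) by alternating least
    # squares: each factor update is the exact minimizer of the squared
    # reconstruction error with the other two factors held fixed.
    a = np.ones(T.shape[0])
    b = np.ones(T.shape[1])
    c = np.ones(T.shape[2])
    for _ in range(n_iter):
        a = np.einsum('ijk,j,k->i', T, b, c) / ((b @ b) * (c @ c))
        b = np.einsum('ijk,i,k->j', T, a, c) / ((a @ a) * (c @ c))
        c = np.einsum('ijk,i,j->k', T, a, b) / ((a @ a) * (b @ b))
    return a, b, c

# A tensor that is exactly rank 1 is recovered (up to scaling).
rng = np.random.default_rng(1)
a0 = rng.random(3) + 0.5
b0 = rng.random(4) + 0.5
c0 = rng.random(5) + 0.5
T = np.einsum('i,j,k->ijk', a0, b0, c0)
a, b, c = cp_rank1(T)
T_hat = np.einsum('i,j,k->ijk', a, b, c)
```

The factors are only identifiable up to rescaling, which is why the check below compares reconstructions rather than factors.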
Fast Laplace Approximation for Gaussian Processes with a Tensor Product Kernel
Gaussian processes provide a principled Bayesian framework, but direct implementations are restricted to small data sets due to the cubic time cost in the data size. When the kernel function is expressible as a tensor product kernel and the input data lie on a multidimensional grid, it has been shown that the computational cost of Gaussian process regression can be reduced considerably. Tensor ...
Efficient Algorithm for Sparse Tensor-variate Gaussian Graphical Models via Gradient Descent
We study the sparse tensor-variate Gaussian graphical model (STGGM), where each way of the tensor follows a multivariate normal distribution whose precision matrix has sparse structure. In order to estimate the precision matrices, we propose a sparsity-constrained maximum likelihood estimator. However, due to the complex structure of tensor-variate GGMs, the likelihood-based estimator is n...
Negative Selection Based Data Classification with Flexible Boundaries
One of the most important artificial immune algorithms is the negative selection algorithm, an anomaly detection and pattern recognition technique; recent research has also shown the successful application of this algorithm to data classification. Most negative selection methods consider deterministic boundaries to distinguish between self and non-self spaces. In this paper, two...
Solving Tensor Structured Problems with Computational Tensor Algebra
Since its introduction by Gauss, matrix algebra has facilitated the understanding of scientific problems, hiding distracting details and enabling more elegant and efficient computational solutions. Today's largest problems, which often originate from multidimensional data, might profit from even higher levels of abstraction. We developed a framework for solving tensor-structured problems with t...
Publication date: 2013